Abstract

It is now widely accepted that the CMOS technology 
implementing irreversible logic will hit a scaling limit 
beyond 2016, and that the increased power dissipation 
is a major limiting factor. Reversible computing can 
potentially require arbitrarily small amounts of energy. 
Recently several nano-scale devices which have the po- 
tential to scale, and which naturally perform reversible 
logic, have emerged. This paper addresses several fun- 
damental issues that need to be addressed before any 
nano-scale reversible computing systems can be real- 
ized, including reliability and performance trade-offs 
and architecture optimization. Many nano-scale de- 
vices will be limited to only near neighbor interactions, 
requiring careful optimization of circuits. We provide 
efficient fault-tolerant (FT) circuits when restricted to 
both 2D and 1D. Finally, we compute bounds on the
entropy (and hence, heat) generated by our FT circuits
and provide quantitative estimates on how large we can
make our circuits before we lose any advantage over ir- 
reversible computing. 

1 Introduction 

There are several compelling reasons for a renewed 
interest in reversible computing systems: First, it is 
now widely accepted that the CMOS technology im- 
plementing irreversible logic will hit a scaling limit be- 
yond 2016, and the increased power dissipation is a 
major limiting factor. Reversible computing [2, 17]
can potentially require zero or very little energy. Sec- 
ond, several new nano-scale devices which have the po- 
tential to scale, and which naturally perform reversible 
logic, have emerged. This paper addresses several fun- 
damental issues that need to be addressed before any 
nano-scale reversible computing systems can be real- 
ized, including: 



1. Reliability and Performance Trade-offs: Current 
nano-scale logic proposals appear to provide ex- 
tremely unreliable devices, requiring extensive use 
of fault-tolerant (FT) circuits. We provide a sys- 
tematic design for reversible FT circuits, which 
will work reliably even if each gate has an error 
probability as high as 1/108. We also calculate the 
blow up in size and gate count that would result 
from the use of such FT circuits. 

2. Architecture Optimization: Many of the proposed
nano-scale devices will be limited to only
near-neighbor interactions [12, 10, 19], requiring
careful circuit optimization. We provide efficient FT
circuits when restricted to both 2D and 1D. We show
that for a 2D topology with near-neighbor connections,
the error threshold decreases only to 1/273, and that
a 1D lattice that is 27 bits wide but arbitrarily long
has an error threshold only 23% less than the full 2D
case.

3. Power Dissipation: Reversible computing systems 
in the presence of errors will generate heat. We 
compute bounds on the entropy (and hence, heat) 
generated by our FT circuits and provide
quantitative estimates on how large we can make our
circuits before we lose any advantage over irreversible
computing. 

Many previous works have considered gate-level
fault tolerance techniques for irreversible gates
[18, 13]. Local fault tolerance schemes for irreversible
automata have also been studied. Quantum computers
are reversible; however, the properties of quantum
errors and quantum information are sufficiently
different from the classical case that fault-tolerant
quantum computation is not directly applicable to the
traditional reversible classical computing model, which
is the subject of this work.

This paper is organized into three main parts. In
Section 2 we give our model of noisy reversible gates,
give a fault-tolerant error-recovery circuit, and
compute the overhead that the circuit requires in both
time and space. We show that, as long as the error
rate is below a threshold, the circuit can be made
reliable. In Section 3 we apply the results of Section
2 to locally connected models where bits can only
operate on their nearest neighbors. We consider the
local problem in 1D and 2D. In Section 4 we compute
how much entropy and heat is dissipated in the
error-recovery process, and we see that the
entropy-saving aspect of reversible computing is lost
in our scheme once the error rate gets close to the
threshold.

2 Reversible majority multiplexing 
Figure 1. The reversible Majority gate constructed
from two controlled-not gates and one Toffoli gate.
The horizontal lines represent bits. Each vertical line
connecting horizontal lines represents a gate. If every
bit connected by a filled dark circle on a particular
vertical line has the value one, the bit on the
horizontal line with the ⊕ is flipped. Time flows from
left to right.



In standard irreversible computing, we often imag- 
ine that functions are represented by circuits of wires 
and gates at fixed positions. We can think of the bits 
as moving through the wires to different gates. In re- 
versible computing, since gates have identical numbers 
of input bits and output bits, we have a choice: we can 
picture the bits moving through wires to gates at fixed 
locations, or picture the bits as fixed locations in space 
and the reversible function as a sequence of reversible 
gates applied on the bits. This is commonly represented 
as the gate array notation where space is on the y-axis 
and time is on the x-axis, and operations are boxes or 
symbols that connect the bits they are applied to (see, 
for instance, Figure 1). This model is realizable in many
nanocomputing proposals [12, 10, 19] (see footnote 1).

The error model is a simple independent gate failure
model: at each application, a gate will randomize all
the bits it is applied to with probability g. While this
model does not directly address correlated failures, it
will apply as long as the probability that k out of G
gates fail is less than $\binom{G}{k} g^k (1-g)^{G-k}$.
For most of this paper, we will assume that this error
rate applies to any three-bit gate, and that we have
access to no fault-free operations. Our goal is to make
larger modules of T reversible gates with a module
error rate which is independent of T, and is in general
much smaller than $1 - (1-g)^T \approx gT$ (which is what
we could expect without any fault tolerance). In fact,
we show that by using $O(T \log^{4.75} T)$ gates instead
of just T, we obtain an error rate that is constant in T.
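The unprotected module error rate and its linear approximation can be compared numerically. The following is only an illustrative sketch: the function name `unprotected_error` and the sample values of g and T are assumptions chosen for the example, not values from the paper.

```python
def unprotected_error(g: float, T: int) -> float:
    """Probability that at least one of T independent gates fails,
    when each gate fails with probability g."""
    return 1.0 - (1.0 - g) ** T


if __name__ == "__main__":
    g = 1e-3  # assumed gate error rate, for illustration only
    for T in (10, 100, 1000, 10000):
        exact = unprotected_error(g, T)
        approx = min(1.0, g * T)  # linear approximation, good when g*T << 1
        print(f"T={T:6d}  exact={exact:.4f}  gT={approx:.4f}")
```

For small gT the two columns agree closely; once gT approaches 1 the module is more likely than not to contain a failure, which is the regime the fault-tolerant construction is meant to avoid.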

Rather than explicitly deal with error correction
codes, the best gate-level, fault-tolerant schemes for
classical computing are those based on von Neumann
multiplexing. In this case, each bit is copied to a
second bit, a random permutation is applied, and
finally, a NAND gate is applied to each pair of bits.
These approaches make use of the irreversible NAND
gate. Schemes such as this can result in fault-tolerant
computation as long as the gate error rate is less than
about 11%.

1 It should be noted that, since all quantum computers are
reversible, this work is particularly applicable to classical
processing on quantum hardware.

Table 1. The truth table for the reversible MAJ
gate. Note that each input has a unique output, and
that the first bit of the output is the majority of the
input bits. This gate is obtained by flipping the second
two bits if the first bit is 1, and then flipping the
first bit if the second two bits are 1.

Rather than base our multiplexing scheme on NAND or
some derivative, we base ours on a reversible extension
of the majority gate (MAJ), depicted in terms of CNOT
and Toffoli in Figure 1 (see footnote 2). The reversible
majority gate (MAJ) has a truth table given in Table 1.
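The construction described in the captions of Figure 1 and Table 1 (flip the second two bits if the first bit is 1, then flip the first bit if the second two bits are both 1) can be written out directly. This is a minimal sketch; the function names `maj` and `maj_inverse` are chosen here for illustration.

```python
from itertools import product


def maj(a: int, b: int, c: int) -> tuple[int, int, int]:
    """Reversible MAJ: two CNOTs controlled by the first bit,
    followed by a Toffoli gate targeting the first bit."""
    b ^= a      # CNOT: flip the second bit if the first bit is 1
    c ^= a      # CNOT: flip the third bit if the first bit is 1
    a ^= b & c  # Toffoli: flip the first bit if both other bits are 1
    return a, b, c


def maj_inverse(a: int, b: int, c: int) -> tuple[int, int, int]:
    """Inverse of MAJ: the same self-inverse gates in reverse order."""
    a ^= b & c
    b ^= a
    c ^= a
    return a, b, c


if __name__ == "__main__":
    # Print the full truth table: input bits -> output bits.
    for bits in product((0, 1), repeat=3):
        print(bits, "->", maj(*bits))
```

Enumerating all eight inputs shows that the outputs are distinct (so the gate is a permutation and hence reversible) and that the first output bit is the majority of the three input bits.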

We claim that the circuit in Figure 2 is in fact a
fault-tolerant error correction circuit. To see this,
consider the code space of 000 representing logical zero
($0_L$) and 111 representing logical one ($1_L$). Consider
Figure 2 as having two phases, encoding and decoding.
The first three MAJ$^{-1}$ gates are the encoding gates
and the last three MAJ gates are decoding gates. After
the encoding gates the bits should all have the same
value (i.e. 000000000 or 111111111 if the input was
$0_L$ or $1_L$, respectively). Decoding puts the majority
value of each block of three bits into the three output
bits. Clearly, if there are no errors, the circuit should
output exactly what was input, due to the symmetry of
the code under permutations. Now we simply observe
that, if any single error occurs, it will change at most
one bit in each of the final decoder blocks. Since the
decoders return the majority result in their output bit,
a single bit flip will not change the majority result. If
one of the final MAJ gates has an error, it will only
affect one bit in the output, and that can be repaired
in the next error-recovery cycle. Thus we have a
fault-tolerant error-recovery because we can tolerate
errors in our recovery gates.

Since the codewords in this system are repetition
code words, we can use any universal, reversible set of
gates for computation directly on the repetition
codewords. After each gate operation, we apply our
error-recovery circuit from Figure 3. Now that we have
a circuit, we can ask: for what error rates will this
circuit perform as expected?

2 We note that variants of the MAJ gate have found
application in algorithmic cooling [5] and reversible addition; thus
MAJ appears to be a valuable gate for reversible and quantum
computers.

Figure 2. A reversible multiplexing scheme
based on the 3-bit repetition code. The output bits
are those bit positions 0,3,6. The rest may be
discarded. This circuit is the error-recovery circuit
referred to as $E_L$ in Figure 3.

2.1 Concatenation

In order to suppress the probability of error, we code
one bit as three bits in a repetition code. But why stop
there? We could also code each of the three bits into
another repetition code, resulting in nine bits, and so
forth. This method is called concatenation.

We say a physical bit is level 0, a three-bit code 000
or 111 is level 1, a nine bit code is level 2, a $3^L$ bit
code is level L. A bit encoded at level L is made up
of three bits encoded at level L - 1. A gate at level 0
is a physical gate to which we have access (in our
case, MAJ). To implement a logical gate at level 1,
we apply the gate at level 0 to each of the three bits in
the code for level 1. To implement a 3-bit gate at level
L, we apply the gate at level L - 1 on the logical bits
in the code, and then correct any errors we may have
caused in each of the bits. This is depicted in Figure 3.

2.2 The threshold for fault-tolerant computation



Throughout this work, we consider the gate error
rate g as the error rate for a 3-bit operation. Thus, we
assume that we can reset three bits with one
initialization operation. The error-recovery circuit
depicted in Figure 2 requires us to initialize six bits
(two 3-bit initialization operations), apply three
MAJ$^{-1}$ gates, and three MAJ gates for a total of eight
gate operations (six if initialization can be assumed to
be far more accurate than our gates). As previously
shown, as long as there is no more than one error in
all of these operations, the final result will not be an
error. We say that the error-recovery circuit requires
E gates.

In addition to the error-recovery, we also have the 
logical gate which we want to apply on the data. To 
apply our logical gate, we need to operate on each of 
the three bits in the code, which gives us three more 
gates which can go wrong. 

A particular bit will be correct unless there are two
or more errors. Thus, if each operation has an error
rate g and there are G = 3 + E operations acting on
each encoded bit (some act on more than one encoded
bit), the bit error rate $P_{bit}$ satisfies:

$$P_{bit} \le \sum_{k=2}^{G} \binom{G}{k} g^k (1-g)^{G-k} \le \binom{G}{2} g^2$$



The probability that a gate has no error is at least
as large as the probability that none of the bits are in
error if each were considered independently:

$$1 - g_{logical} \ge (1 - P_{bit})^3$$

The above is true because the right hand side triple
counts the case where the logical gate (the gate applied
to all three bits) fails. We note that the above bound
is a convenient bound, but a tighter bound will result
in an improved error threshold. We can use the above
to see that:

$$g_{logical} \le 1 - (1 - P_{bit})^3 \le 3 P_{bit} \le 3\binom{G}{2} g^2 \qquad (1)$$

Thus if we want $g_{logical} < g$, it is sufficient to have
$g < 1/\left(3\binom{G}{2}\right)$, which we call the threshold.
In our cases of G = 11 (3 + (E = 8)) and G = 9
(3 + (E = 6)), we get threshold results of
$\rho = 1/165$ and $\rho = 1/108$, respectively.
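The two threshold values quoted above follow from taking one over three times the number of pairs among the G operations. A short check, using exact rational arithmetic (the helper name `threshold` is an illustrative choice):

```python
from fractions import Fraction
from math import comb


def threshold(G: int) -> Fraction:
    """Threshold 1 / (3 * C(G, 2)) for G operations per encoded bit."""
    return Fraction(1, 3 * comb(G, 2))


if __name__ == "__main__":
    for G in (9, 11):
        print(f"G={G}: threshold = {threshold(G)}")
```

Using `Fraction` avoids floating-point rounding, so the results can be compared directly with the fractions stated in the text.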

The above only shows that we can decrease the
logical error rate by applying the error-recovery
circuit; it does not show that we can make it as small
as we like. In order to push the error rate to lower
values, we concatenate our bits recursively. At the
lowest logical level, L = 0, each bit is represented by a
physical bit. At all other levels, each bit at level L is
represented by 3 bits at level L - 1. Thus, at level L
we are actually using $3^L$ physical bits. We only pay
attention to the error rates at the largest level, but
after each gate at level L we do an error-recovery at
level L, which in turn applies gates at level L - 1, and
so on, until we reach the bottom level. This recursive
structure is represented in Figure 3.

Thus, if the error probability at level k is $g_k$,
$g_0 = g$, and we have G total operations to perform,
Equation 1 tells us that:

$$g_{k+1} \le 3\binom{G}{2} g_k^2$$



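To solve equations like the one above, we introduce
$\kappa = \log 3\binom{G}{2}$, and $r_k = \log g_k$:

$$
\begin{aligned}
g_{k+1} &\le 3\binom{G}{2} g_k^2 \\
\log g_{k+1} &\le \log 3\binom{G}{2} + 2\log g_k \\
r_{k+1} &\le \kappa + 2 r_k \\
(r_{k+1} + \kappa) &\le 2 (r_k + \kappa) \\
r_k + \kappa &\le 2^k (r_0 + \kappa) \\
g_k &\le \frac{\left(3\binom{G}{2} g_0\right)^{2^k}}{3\binom{G}{2}} \qquad (2)
\end{aligned}
$$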
Figure 3. A gate at concatenation level L. We
apply the gate three times at level L - 1 and
then do an error correction cycle. $G_{L-1}$ is the
gate operation at the lower logical level, $E_L$ is
the error-recovery circuit (Figure 2) on logical
level L (using only gates at level L - 1).
Assuming G = 9, $3\binom{G}{2} = 108$; thus, if the
lowest level gate error is $g_0 < 1/108$, we can correct
all the errors with high probability if we use enough
levels of concatenation. Throughout this paper we use
$\rho$ for threshold values. Thus, if we define
$\rho = 1/\left(3\binom{G}{2}\right)$, and if $g_0 < \rho$,
we are sure to be able to correct all errors in the limit
of large L.
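The behavior of the level-by-level bound can be seen by iterating it and comparing with its closed form, (c g_0)^(2^k) / c with c = 3 C(G,2). The sketch below is illustrative; the function names and the starting error rate of one tenth of the threshold are assumptions made for the example.

```python
from math import comb


def iterate_error(g0: float, G: int, levels: int) -> list[float]:
    """Iterate the bound g_{k+1} = 3 * C(G, 2) * g_k**2 starting from g0.
    Returns the list [g_0, g_1, ..., g_levels]."""
    c = 3 * comb(G, 2)
    rates = [g0]
    for _ in range(levels):
        rates.append(c * rates[-1] ** 2)
    return rates


def closed_form(g0: float, G: int, k: int) -> float:
    """Closed form of the same recursion: (c * g0) ** (2 ** k) / c."""
    c = 3 * comb(G, 2)
    return (c * g0) ** (2 ** k) / c


if __name__ == "__main__":
    G = 9
    g0 = 1 / 1080  # assumed: one tenth of the G = 9 threshold 1/108
    for k, rate in enumerate(iterate_error(g0, G, 4)):
        print(f"level {k}: bound {rate:.3e}  closed form {closed_form(g0, G, k):.3e}")
```

Below the threshold the bound shrinks doubly exponentially in the number of levels; starting above the threshold the same recursion grows instead, so the bound gives no guarantee there.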

2.3 Circuit blowup 

In this section, we consider how much larger a mod- 
ule made of T perfect gates will be when constructed 
from the FT techniques of this paper, such that the 
final module will have a constant probability of error 
irrespective of T. 

To implement a 3-bit gate at level L, we must correct
3 bits at level L - 1. Starting from Figure 3 we can
see that if $\Gamma_k$ is the number of gates required for one
complete error correction and gate operation on levels
k and lower, and E is the number of gates required to
do error-recovery, then we have that:

$$
\begin{aligned}
\Gamma_k &= 3\Gamma_{k-1} + 3E\,\Gamma_{k-1} \\
         &= 3(1 + E)\,\Gamma_{k-1} \\
         &= (3(1 + E))^k
\end{aligned}
$$

Using G = 3 + E, we have:

$$\Gamma_k = (3(G-2))^k$$

So there is an exponential blow-up in gate count with
concatenation depth. The bit size blowup is just as
easy to calculate: at each level, we use 9 bits of the
level below:

$$S_k = 9 S_{k-1} = 9^k$$



How deep do we need to concatenate? If the module
we want to simulate has T gates, we need $g_L < 1/T$ in
order to have fewer than one error on average in the FT
module. Hence, by bounding Equation 2:

$$\rho \left(\frac{g}{\rho}\right)^{2^L} \le \frac{1}{T}$$

$$L \ge \log_2 \left( \frac{\log T\rho}{\log \rho/g} \right) \qquad (3)$$

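If we use the minimum valid value for L, the gate
blow-up factor is $\Gamma_L = (3(G-2))^L$, and is poly-log
in T:

$$(3(G-2))^L = 2^{L \log_2 3(G-2)} = \left(\frac{\log T\rho}{\log \rho/g}\right)^{\log_2 3(G-2)}$$

As is the size blow-up factor:

$$S_L = 9^L = 2^{L \log_2 9} = \left(\frac{\log T\rho}{\log \rho/g}\right)^{\log_2 9} \approx \left(\frac{\log T\rho}{\log \rho/g}\right)^{3.17}$$

For G = 11, we have $(3(G-2))^L = O((\log T)^{4.75})$ and
$S_L = O((\log T)^{3.17})$.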

Suppose we have $g = \rho/10$ with G = 9 and
$\rho \approx 10^{-2}$. Without any error correction, modules
larger than 1,000 gates will almost certainly be faulty.
If we want to make a module of $T = 10^6$, we need
$L = \log_2((\log_2 10^4)/\log_2 10) = 2$, so we can make
an accurate module with $10^6$ gates, using 2 levels of
concatenation, if $g = \rho/10$. This means that, rather
than using one gate, we will need to replace each with
$(3(G-2))^2 = 441$ gates and replace each bit with
$9^2 = 81$ bits. However, we are able to construct a much
larger module: $10^6$ logical gates rather than 1,000
logical gates.
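The worked example above can be reproduced in a few lines. This is a sketch under the stated formulas (depth from Equation 3, gate factor (3(G-2))^L, bit factor 9^L); the function names and the small rounding guard are choices made here, not part of the paper.

```python
from math import ceil, log, log2


def levels_needed(T: float, g: float, rho: float) -> int:
    """Smallest integer L with L >= log2(log(T*rho) / log(rho/g))."""
    if not (0 < g < rho):
        raise ValueError("requires 0 < g < rho")
    if T * rho <= 1:
        return 0  # the bound g_L <= 1/T already holds without concatenation
    ratio = log(T * rho) / log(rho / g)
    # Subtract a tiny tolerance so floating-point noise at exact powers
    # of two does not push the ceiling up by one level.
    return max(0, ceil(log2(ratio) - 1e-9))


def overhead(L: int, G: int) -> tuple[int, int]:
    """Gate and bit blow-up factors (3(G-2))**L and 9**L."""
    return (3 * (G - 2)) ** L, 9 ** L


if __name__ == "__main__":
    L = levels_needed(T=10**6, g=1e-3, rho=1e-2)
    gates, bits = overhead(L, G=9)
    print(f"L = {L}, gates per logical gate = {gates}, bits per logical bit = {bits}")
```

With T = 10^6, g = rho/10 and rho = 10^-2 this gives two levels of concatenation, 441 physical gates per logical gate, and 81 physical bits per logical bit.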

We have shown that we can take noisy gates and 
create modules of bounded noise with only a poly-log 
overhead factor. Once we have modules with bounded 
noise, higher level fault tolerance techniques may be 
applied. 
In this section, we consider the problem of an array
of bits on which we may operate with noisy, reversible 



gates. We assume that we may only operate on at most 
three neighboring bits at a time. When it is necessary 
to operate on pairs of remote bits, we must first move 
them close together by a series of SWAP operations 
and then operate. This introduces extra overhead into 
every logical operation. Since we are assuming that 
each operation is noisy we expect this to reduce the 
error threshold. We will first consider a 2D array and 
see that the 2D array only requires extra SWAP op- 
erations to operate on three logical bits, and no extra 
SWAP operations to do error-recovery. As such, the 
threshold is not much lower than the result of Section 
2.2. Later, we consider a 1D array; in this case, our
error correction circuit will require many SWAP oper- 
ations. We can expect this to lower the threshold much 
more than the 2D case. 

3.1 A local 2-dimensional fault-tolerant system 

In Figure 2 we presented a fault-tolerant error
correction circuit which assumes that any pair of bits may
be operated on; notice, however, that during the recov- 
ery process, only certain bits interacted. In practice, 
many systems may allow only local operations. In this 
section we consider a 2D lattice of bits. At each time 
step any adjacent pairs (or triples) of bits may interact. 

If we put the circuit from Figure 2 on the lattice in
Figure 4, we see that all the bits that interact in the
recovery circuit are already near one another in the 
lattice. The only additional complication we need to 
consider is the difficulty of bringing three logical bits 
near one another in order to do a logical operation. 

To operate on logical bits, we must interleave the
bits, operate locally, and then uninterleave. But in
which direction do we interleave? There are two
directions: parallel and perpendicular to the logical
bit line (see Figure 4). Interleaving three logical bits
parallel to the logical line requires nine SWAP gates.
Interleaving three logical bits perpendicular to the
logic line requires 12 SWAP gates. However, both
schemes use at most six SWAPs on a given logical bit.
If we combine two SWAPs into one three-bit gate, which
we call SWAP$_3$, depicted in Figure 5 (and then only
count three-bit gates in the threshold), we only use
three SWAP$_3$ gates.

Thus, a full cycle now requires six additional gate
operations: three to interleave, three to uninterleave
the bits (see footnote 3). Thus our total gate count is
14 if we ignore bit initialization, and 16 if we include
bit initialization. As such, the threshold using only
local operations in 2D becomes
$\rho_2 = 1/\left(3\binom{14}{2}\right) = 1/273$ and
$\rho_2 = 1/\left(3\binom{16}{2}\right) = 1/360$, respectively.
Clearly, if we can initialize bits with error
probability very much lower than 1/273, initialization
errors can be ignored in the threshold calculation, and
we can assume that the gate error rate only needs to
reach the larger threshold, which is approximately 0.4%.

3 The error-recovery circuit in Figure 2 actually rotates the
logical bit line, but as long as all bits are recovered at the same
time, this rotation is uniform throughout the circuit and can be
ignored.

Figure 4. Layout of bits on a 2D lattice. Boxes denote the locations of the logical bits; the other bits
are ancillary bits. To bring logical bits close to one another, we can either move them perpendicular
to the logical line (q0,q1,q2) by swapping them past the ancillary bits in between two logical lines
(q3,q4,q5 for one bit and q6,q7,q8 for the other), or we can move them parallel to the logical line, in
which case the two logical bits are adjacent to each other in a line and must be interleaved (q0,q1,q2
with the next q0,q1,q2 just below it in the above figure).

Figure 5. A SWAP$_3$ gate, which is composed
of two swaps on three bits. Since this gate
only acts on three bits, we assume that its
error rate is at most g, the error of any 3-bit
gate.

Figure 6. Interleaving three codewords which
are linearly adjacent.

3.2 A local 1 -dimensional fault-tolerant system 

Figure 7 uses only nearest-neighbor operations on a
1D array. The error correction circuit requires six MAJ
gates, nine SWAPs, and six initializations. Instead of
counting nine SWAPs, we count four SWAP$_3$ gates
and one SWAP. Instead of counting six initializations
we count two 3-bit initializations. This gives a total
of 11 gates or 13 gates, without or with initialization,
respectively.

We want to balance the number of SWAPs applied
to each codeword so that no codeword is corrupted
more than the other two. As such, we interleave by
bringing the two outer codewords close to the middle
codeword. In order to interleave three logical bits
$b_0, b_1, b_2$, we move the last bit in $b_0$ just above the
last bit in $b_1$. Then we move the second bit in $b_0$ just
above the second bit in $b_1$. Next, we move the first bit
in $b_0$ just above the first bit in $b_1$. Subsequently, we
do a similar operation on $b_2$.

Figure 7. A fault-tolerant error-recovery circuit
in one dimension with only local gate operations.
This may be thought of as Figure 2 plus Figure 6
handling the interleaving of the bits.

Interleaving $b_0$ and $b_1$ requires 8 + 7 + 6 SWAPs (8
for the last bit, 7 for the second bit, 6 for the first bit).
Interleaving $b_2$ requires 10 + 8 + 6 SWAPs (10 for the
first bit, 8 for the second, and 6 for the last). This
gives a total of 45 SWAPs; however, at most 24 act on
a single codeword. If, instead of counting SWAPs, we
count SWAP$_3$ gates, we have only 12 SWAP$_3$ gates
acting on each codeword to interleave.

Thus, a full operation is 12 SWAP$_3$ on each
codeword to interleave, the gate operation, which
touches each of the three bits in each codeword, and
finally 12 SWAP$_3$ gates to uninterleave, for a total of
27 gates, in addition to the error-recovery cycle. The
error-recovery cycle requires 13 gates (11 if we neglect
initialization), for a total of 40 gates. This yields a
threshold of $\rho_1 = 1/\left(3\binom{40}{2}\right) = 1/2340$
(or $\rho_1 = 1/2109$ if bit initialization is much more
accurate than $\rho_1$). Thus, we find that $\rho_1$ is about
an order of magnitude worse in the 1D case than it is
in the 2D case. In the next section, we will see how
using a few levels of 2D at the lowest level can
recapture most of the advantage that 2D offers in the
threshold.
3.3 Concatenating different thresholds

If a particular fault-tolerant scheme has a threshold
$\rho_2$ and the elementary gates have an error rate of g,
Equation 2 tells us that, after k levels of concatenation,
the resulting error rate is less than:

$$g_k = \rho_2 \left(\frac{g}{\rho_2}\right)^{2^k}$$

If, after k levels of this scheme, we concatenate with
L - k levels of a scheme with threshold $\rho_1$, we have:

$$g_L = \rho_1 \left(\frac{g_k}{\rho_1}\right)^{2^{L-k}} = \rho_1 \left(\frac{\rho_2}{\rho_1}\left(\frac{g}{\rho_2}\right)^{2^k}\right)^{2^{L-k}} = \rho_1 \left(\frac{g}{\rho(k)}\right)^{2^L}$$

with $\rho(k) = \rho_2 \left(\rho_1/\rho_2\right)^{2^{-k}}$. Thus, if we
use k levels of the scheme with the more favorable
threshold at the lowest levels, we can get most of the
advantage of that threshold, even with a small, finite
number of concatenations. Table 2 shows that, after a
few levels of 2D concatenation, the threshold
approaches the 2D case studied in Section 3.1. In
particular, a linear array nine bits wide has a threshold
60% as large as the full 2D case, and an array 27 bits
wide has a threshold 77% as large as the full 2D case.
This underscores the fact that most of the benefits of
a 2D structure accrue in the first few levels of
concatenation.
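The effective threshold of a mixed scheme, rho(k) = rho_2 (rho_1/rho_2)^(2^-k), can be tabulated directly. The sketch below assumes the gate counts that ignore initialization (G = 14 for 2D and G = 38 for 1D, i.e. 1/273 and 1/2109); that pairing is an assumption made here because it reproduces the 60% and 77% figures quoted above, and the function names are illustrative.

```python
from math import comb


def rho(G: int) -> float:
    """Threshold 1 / (3 * C(G, 2)) for G operations per encoded bit."""
    return 1.0 / (3 * comb(G, 2))


def mixed_threshold(rho_1: float, rho_2: float, k: int) -> float:
    """Effective threshold rho(k) = rho_2 * (rho_1 / rho_2) ** (2 ** -k)
    when the lowest k levels use the scheme with threshold rho_2."""
    return rho_2 * (rho_1 / rho_2) ** (2.0 ** -k)


if __name__ == "__main__":
    rho_2 = rho(14)  # assumed: 2D scheme, initialization ignored (1/273)
    rho_1 = rho(38)  # assumed: 1D scheme, initialization ignored (1/2109)
    for k in range(5):
        width = 3 ** k  # width in bits of the 2D strip at the lowest k levels
        frac = mixed_threshold(rho_1, rho_2, k) / rho_2
        print(f"k={k}  width={width:3d} bits  rho(k)/rho_2 = {frac:.2f}")
```

At k = 0 the result is the pure 1D threshold, and each additional 2D level takes a square root of the remaining ratio, which is why the gap closes so quickly.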

4 Entropy dissipation 

Reversible computing has been proposed as a
method to reduce power consumption of computing
devices. In some quantum systems, reversible logic is
all that is available, and irreversible devices must be
simulated from reversible ones (by discarding or
resetting bits). However, a Toffoli gate can simulate an
irreversible NAND gate by dissipating at most 3/2 bits
of entropy per cycle (see footnote 4). Similarly, due to
the universality of NAND, we can use NAND gates to
build a functionally equivalent gate to Toffoli at an
entropic expense of only a few bits. Hence, once our
encoded gates dissipate 3/2 bits of entropy per
operation, we can say that we have actually used
faulty, reversible gates to build fault-free irreversible
logic.

4 The value of 3/2 bits is in fact optimal (assuming equally
likely inputs and using only reversible logic), and may be
achieved using the MAJ$^{-1}$ gate.

Table 2. If we concatenate k levels of 2D
circuits with L - k levels of 1D circuits, we see
that the threshold rapidly approaches the 2D
case. If we are allowed 27 lines rather than
just one, we can get a threshold which is only
23% smaller than the 2D case.

It has been shown that if a reversible computer has
errors, there must be a supply of fresh zero bits in
order to remove entropy from the computer. Here,
we estimate how much entropy per gate must be
dissipated during fault-tolerant operation of a noisy
reversible computer. We note that when n bits have
$n \times H$ bits of entropy, it is not necessary to replace
them with n zero-entropy bits; instead, reversible
cooling [15, 3, 5] schemes can ensure that we only need
to replace $n \times H$ of them with zero-entropy bits. Thus,
asymptotically, we need to calculate the expected
amount of entropy in the ancillary bits, and that will
correspond to the number of bit resets we will need in
our system.

Landauer pointed out that where there is
irreversibility in computing, there must be heat
dissipation [11]. Thus, by computing the amount of
entropy dissipated, we know that the heat dissipated is:

$$\Delta E \ge k_b\, t\, \Delta H$$

where $\Delta H$ is the amount of entropy dissipated, $k_b$ is
Boltzmann's constant, and t is the temperature.
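To give a sense of scale, the bound can be evaluated numerically. The sketch below assumes the entropy is counted in bits, so a factor of ln 2 converts it to natural units before multiplying by Boltzmann's constant and the temperature; that conversion, the 300 K temperature, and the function name `landauer_heat` are assumptions made for illustration.

```python
from math import log

BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def landauer_heat(bits: float, temperature_kelvin: float) -> float:
    """Minimum heat in joules when `bits` of entropy are dissipated at the
    given temperature. Assumes entropy is measured in bits, hence ln 2."""
    return BOLTZMANN * temperature_kelvin * log(2) * bits


if __name__ == "__main__":
    t = 300.0  # assumed operating temperature in kelvin
    for bits in (1.0, 1.5):
        print(f"{bits} bits at {t} K: at least {landauer_heat(bits, t):.3e} J")
```

The 3/2-bit case corresponds to the entropy cost quoted earlier for simulating an irreversible NAND gate with reversible logic.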

We represent the number of gates at level L - 1
needed to simulate a gate at level L as G. The value of
G will depend on the model we are working with, e.g.
non-locally connected vs. locally connected. Following
the calculations in Section 2.3 and due to subadditivity
of entropy, if $H_L$ is the entropy generated by one gate
operation at level L, we can see that:

$$H_L \le G\,H_{L-1} \le G^{L-1} H_1$$
Not every error is distinguishable, so assuming each
error can be distinguished provides an upper bound.
When a gate has an error, we assume it outputs totally
random bits; thus, with probability 1 - g, the output
is correct, and with probability g the output is one of
eight equally likely outputs. Calculating the entropy
of such a scenario yields
$H(\tfrac{7}{8}g) + \tfrac{7}{8}g\log 7$, and we find:

$$H_1 \le G\left(H(\tfrac{7}{8}g) + \tfrac{7}{8}g\log 7\right)$$

If we define a constant $\kappa$ such that
$H(\tfrac{7}{8}g) + \tfrac{7}{8}g\log 7 \le \kappa\sqrt{g}$, then we have:

$$H_L \le G^L \kappa \sqrt{g}$$
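The per-gate entropy expression can be checked against a direct computation over the eight possible outputs: the correct output has probability 1 - g + g/8 and each of the seven wrong outputs has probability g/8, which gives H(7g/8) + (7g/8) log 7 in bits. The function names below are illustrative.

```python
from math import log2


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def faulty_gate_entropy(g: float) -> float:
    """Entropy in bits of a 3-bit output that is correct with probability
    1 - g and otherwise uniformly random over all eight outputs."""
    probs = [1 - g + g / 8] + [g / 8] * 7
    return -sum(p * log2(p) for p in probs if p > 0)


def formula(g: float) -> float:
    """Closed form H(7g/8) + (7g/8) * log2(7)."""
    x = 7 * g / 8
    return binary_entropy(x) + x * log2(7)


if __name__ == "__main__":
    for g in (0.001, 0.01, 0.1):
        print(f"g={g}: direct={faulty_gate_entropy(g):.6f}  formula={formula(g):.6f}")
```

The two columns agree, since the closed form is just the grouping of the eight-outcome distribution into "correct" versus "one of seven wrong outputs".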



To obtain a lower bound, we recall that we assume
independent errors. From Figure 3 we can see that
errors in the recovery process are independent of errors
in different logical bits. After each gate at level L > 0,
we do error-recovery, and this is where the entropy is
removed by means of bit resets. If there are E gates in
the recovery process, we know that:

$$H_L \ge 3E\,H_{L-1} \ge (3E)^{L-1} H_1$$

We note that every gate touches at least one
ancillary bit. With probability g the physical gate fails.
When the gate fails, that bit will be flipped with
probability 1/2; thus, the entropy of all the ancillary
bits is at least:

$$H_1 \ge H(g/2) \ge g$$

So we have:

$$H_L \ge g\,(3E)^{L-1}$$

Putting both the upper and lower bounds together:

$$g\,(3E)^{L-1} \le H_L \le G^L \kappa \sqrt{g}$$

If we want to have O(1) bits of entropy per gate, by
bounding the left side of the above equation, we must
have:

$$
\begin{aligned}
g\,(3E)^{L-1} &\le 1 \\
\log g + (L-1)\log 3E &\le 0 \\
L &\le \frac{\log \frac{1}{g}}{\log 3E} + 1
\end{aligned}
$$

For example, if $g = 10^{-2}$, and E = 11, we have L ≤ 2.3.
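The numerical example can be reproduced from the condition g (3E)^(L-1) ≤ 1, which rearranges to L ≤ log(1/g)/log(3E) + 1. A minimal sketch (the function name `max_levels` is an illustrative choice):

```python
from math import log


def max_levels(g: float, E: int) -> float:
    """Largest (real-valued) L satisfying g * (3E)**(L - 1) <= 1."""
    return log(1 / g) / log(3 * E) + 1


if __name__ == "__main__":
    g, E = 1e-2, 11
    L = max_levels(g, E)
    print(f"g={g}, E={E}: L <= {L:.2f}")
    # Sanity check: at the bound, the lower bound on entropy per gate is 1 bit.
    print(f"g * (3E)^(L-1) = {g * (3 * E) ** (L - 1):.3f}")
```

Since L must be an integer, this example allows at most two levels of concatenation before the lower bound on entropy per gate exceeds one bit.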



We see that the entropy-saving aspect of reversible 
computing is indeed highly sensitive to error. Both the 
upper and lower bounds of entropy per gate are expo- 
nential in L for fixed g. At the same time, we see that, 
even if there is some small finite error with $g \ll 1$, the
entropic savings relative to irreversible computing may
be obtained by using $O(\log \frac{1}{g})$ levels of error correction.

5 Conclusion 

We have given a method of producing fault-tolerant
reversible circuits. We also considered this problem
when only local communications are allowed, which we
believe will be very valuable for quantum computing
systems that need to perform some classical processing
without having to resort to quantum measurements.
We also note that the circuits and threshold values
presented here represent a lower bound on the
threshold for reversible, fault-tolerant logic. There may
exist improved schemes which could improve the
threshold values; however, the circuits here provide an
existence proof.

While we showed how to make fault-tolerant, re- 
versible circuits, we also saw that, when the error rate 
is near the threshold, there is considerable cost (in bits, 
gates, and entropy) to the error correction procedure. 
While it was already known that, in principle, any 
noisy reversible computer must dissipate entropy,
our circuits provide a useful upper bound on how much 
entropy must be released in the computing process with 
noisy reversible gates. 

The authors would like to thank Michael Frank 
for many helpful comments, and Thomas Szkopek for 
pointing out an error in an earlier draft. 